First-principles energetics of water: a many-body analysis 
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Standard forms of density-functional theory (DFT) have good predictive power for many materi- 
als, but are not yet fully satisfactory for solid, liquid and cluster forms of water. We use a many-body 
separation of the total energy into its 1-body, 2-body (2B) and beyond- 2-body (B2B) components 
to analyze the deficiencies of two popular DFT approximations. We show how machine-learning 
methods make this analysis possible for ice structures as well as for water clusters. We find that the 
crucial energy balance between compact and extended geometries can be distorted by 2B and B2B 
errors, and that both types of first-principles error are important. 

PACS numbers: 31.15.E-, 61.50.Lt, 71.15.Mb 
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The pioneering work of Parrinello, Car and others in 
the 1990s [l[ initiated a major effort to understand the 
properties of water from first principles using density- 
functional theory (DFT). This effort is important not 
just for pure water, but for general aqueous systems, in- 
cluding solutions [H and water on surfaces 0]. However, 
standard DFT methods often give less than satisfactory 
predictions for water in its liquid crystalline 0, Q 
and cluster forms f° r reasons that are controver- 

sial. Strenuous efforts have been made to overcome the 
problems by adding correction terms, with a recent em- 
phasis on correcting dispersion (see e.g. Refs [(I Hol - [l3 | ) . 
Here we show how a combination of quantum chemistry, 
machine learning and quantum Monte Carlo can be used 
to analyze the energetics of water in a variety of aggrega- 
tion states. We find that DFT approximations often dis- 
tort the energy balance between extended and compact 
structures, and that errors can arise from more than one 
part of the first-principles energy. Technical details of our 
calculations are given in the Supplemental Material [l5[ . 

Our starting point is that a model for the energetics 
of water is not fully satisfactory unless it gives good pre- 
dictions for a range of aggregation states, and particu- 
larly: water clusters (including the dimer) in a variety of 
geometries, ice structures, and the liquid. By this crite- 
rion, standard DFT approximations need improvement, 
since their many errors include: inaccurate predictions of 
energies for some dimer geometries 0, H| ; wrong stability 
ordering of isomers of some clusters, notably the hex- 
amer |9j, llfjfl ; incorrect relative energies of different ice 
structures [fj|; errors of up to 30 % in the predicted den- 
sity of the liquid 

[lilllillllli; and substantial errors m 
the structure and diffusivity of the liquid 0,113, 12-14j. 



as hjl 



Our analysis of water energetics is based on the 
many-body decomposition, in which the total energy 
£tot(l, 2, . . . N) of a system of N monomers is expressed 



£ tot (l,2,...iV) = ^£«( 



5>^-) + £ (>2) (l,2,...A0 
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Here, i is the collection of variables describing monomer 

1 (position, orientation, internal distortion from equi- 
librium geometry), E^ x \i) the 1-body (IB) energy of 
monomer i, E^ 2 \i,j) the 2-body (2B) interaction energy 
of monomers i and j, and the beyond-2-body (B2B) en- 
ergy i?( >2 ) is everything not included in IB and 2B en- 
ergy. In this scheme, disp ersion (non-local correlation) is 
mainly a 2B energy l2fj| . ElT] . though it also contributes 
to the B2B energy |22j. Induction is usually regarded 
as the largest contributor to B2B, though many-body 
exchange-repulsion may also be significant [22j . 

We start with the water dimer, whose energy in any 
geometry can be accurately computed (within 0.1 nxEh ~ 

2 meV ~ 0.05kcal/mol [23j]) using the correlated quantum 
chemistry technique of CCSD(T) (coupled cluster sin- 
gle and double excitations with a perturbative treatment 
of triples) 0. We take the difference 5£( 2 )(DFT) = 
£K 2 )(DFT) - £( 2 )(CCSD(T)) as the error of any DFT 
approximation for the 2B energy of any dimer geometry. 
Following Refs. (jLli^. we study the errors 8E {2 \V>¥T) 
for thermal samples of dimer geometries randomly drawn 
from an m.d. simulation of the liquid, the simulation used 
here being done with the classical AMOEBA force field (2|| , 
as in Ref. 8J. Fig. Q] shows the 2B errors of the widely 
used PBE 23] and BLYP [28| approximations versus O- 
O distance for a sample of 198 dimer configurations fljj ]. 
demonstrating (see also Refs. [l,[29j]) that BLYP system- 
atically underbinds for all distances and molecular orien- 
tations, while PBE performs better (but see Ref. [301). 
We have shown recently [3l| that machine-learning meth- 
ods based on the ideas of Gaussian approximation po- 
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tentials (GAP) [3.2] can be used to represent accurately 
(within 0.1 mE^ ~ 2 meV or better) the IB and 2B er- 
rors of chosen DFT approximations in water systems. We 
illustrate this by including in Fi g. [T] the tiny residual 2B 
errors of GAP-corrected BLYP 15]. This gives a way of 
correcting almost perfectly for the 2B errors of any DFT 
approximation. 
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FIG. 1: Errors of 2-body energy of H2O dimer with BLYP and 
PBE functionals relative to CCSD(T) benchmarks as function 
of O-O distance. Calculations are for sample of 198 dimer 
configurations drawn from a classical m.d. simulation of bulk 
water at ambient conditions. Also shown are residual errors 
of the approximation obtained by adding GAP 2-body cor- 
rections to BLYP. Units: mE^. 



The ability to correct almost exactly for IB and 2B 
errors in the total energy of any water system is invalu- 
able, because it lets us decompose DFT error into its 
IB, 2B and B2B components. We show how this works 
for the isomers of the water hexamer. The energies of 
the prism, cage, book and ring isomers of the hexamer 
(see e. g. R ef. [Sfljbr pictures) have been intensively stud- 



ied 

@, M Hill for an important reason. For smaller 
clusters, the most stable isomers have ring-like geome- 
tries, but from the hexamer onwards compact geome- 
tries are more stable. Highly converged CCSD(T) calcu- 
lations [33l show that the compact prism and cage are 
more stable than the extended book and ring isomers. 
However, standard DFT approximations wrongly make 
the extended geometries more stable [9|: our own cal- 
culations of the total energies (Fig. [2]) illustrate this for 
PBE and BLYP [H]. If we now correct for IB and 2B 
errors by adding the GAP representation of the differ- 
ences DFT - CCSD(T), then the errors of the resulting 
approximations (we denote them by PBE-2 and BLYP- 
2) are by definition B2B errors. As shown in Fig. [2l the 
errors of BLYP-2 are negative but almost constant, so 
that the relative energies are now excellent. The errors 
of PBE-2 are smaller than those of PBE itself, but are 
still not negligible. This means that the erroneous sta- 



bility ordering with BLYP is mainly due to 2B effects, 
but with PBE both 2B and B2B effects are important, 
as pointed out in earlier work [1, 
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FIG. 2: Left panel: total energies of isomers of the H2O hex- 
amer relative to free monomers from PBE, BLYP and bench- 
mark CCSD(T) calculations; right panel: errors of total en- 
ergy of PBE and BLYP and 1- and 2-body corrected PBE-2 
and BLYP-2. Units: mE h 

We turn now to the energetics of ice structures, which 
gives striking evidence of the difficulties of standard first- 
principles methods @. Essentially the same analysis 
that we used for the hexamers helps determine the na- 
ture of DFT errors for the cohesive energies of ice struc- 
tures. Ice has a complicated phase diagram, with no 
less than 15 known crystal structures [34|, but we study 
only the proton-ordered structures XI, II, XV and VIII 
forming the sequence of increasingly compact structures 
found at low temperatures as pressure increases from 
to ~ 20 kbar. The errors of DFT for these and other 
structures have recently been studied in detail [f| , and it 
was shown that the predicted energies increase much too 
fast from extended to compact structures. We illustrate 
this in Fig. 02 where our calculated cohesive energies with 
PBE and BLYP 15] are compared with experimental val- 
ues 35 1 (zero-point energies removed, see also Ref. [H). 



The GAP IB and 2B corrections to PBE and BLYP are 
readily computed in periodic boundary conditions pjlj . 
and Fig. [3] shows the errors of the uncorrected and cor- 
rected approximations for the XI, II, XV and VIII struc- 
tures. The picture resembles what we saw for the hex- 
amers, with BLYP being increasingly underbinding as 
we go from extended to compact structures, but its cor- 
rected version BLYP-2 having almost constant negative 
errors, so that its relative energies are very good. By 
contrast, the corrected version PBE-2, while better than 
PBE itself, still gives substantial errors. This implies 
that for BLYP the problem with relative energies stems 
mainly from systematically underbinding 2B interaction, 
but that B2B errors are also important for PBE. 
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smaller) errors, so that B2B effects are important. 
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FIG. 3: Left panel: energies of ice structures relative to free 
monomers from PBE, BLYP and experiment; right panel: er- 
rors of PBE and BLYP and 1- and 2-body corrected PBE-2 
and BLYP-2 energies. Energy units: mE^. 



The energy changes with increasing compactness along 
the series XI, II, XV, VIII can be understood in more 
detail. In all these structures, each H2O monomer is 
hydrogen-bonded to four first neighbors at 0-0 distances 
of ~ 2.7 A [34]. In XI (the proton-ordered form of the Ih 
structure of common ice) the monomers form a tetrahc- 
dral network, the second-neighbors being at the large dis- 
tance of 4.5 A. From XI to II, XV and VIII, the hydrogen- 
bonded lst-neighbor distances change only slightly, but 
the 2nd-neighbor distances contract dramatically, until in 
VIII each monomer has eight neighbors at almost equal 
distances of ~ 2.8 A, four of which are unbonded to the 
central monomer |34| . The close approach of monomers 
that are not H-bonded to each other in the compact struc- 
tures appears to be implicated in the lar ge D FT errors, 
as has been pointed out before (e.g. Ref. [12[). 

To further probe the DFT errors in the denser ice struc- 
tures, we cut from ice VIII the nonamer composed of an 
H2O molecule and its nearest neighbors, and we study 
the energy changes when the H-bonded neighbors are 
held fixed but the unbonded neighbors are moved radi- 
ally. We calculated benchmark energies for the resulting 
configurations using diffusion Monte Carlo 15j,|36(, which 
is extremely accurate both for water clusters and for ice 
structures p, H, 0], including ice VIII. Comparison with 
PBE and BLYP energies [l| (Fig. HJ shows that both 



approximations give excessive energy increases on going 
from extended to compact configurations. Comparing 
the GAP-corrected approximations PBE-2 and BLYP-2 
with uncorrected PBE and BLYP (Fig. [4j, we see again 
what we learnt from the hexamers and the ice struc- 
tures. Correction for IB and 2B errors takes BLYP from 
severely underbinding to somewhat overbinding, but with 
almost constant B2B errors; corrected PBE, while better 
than uncorrected PBE, still suffers from similar (though 
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FIG. 4: Left panel: total binding energies of the four non- 
amers derived from the ice VIII structure computed with 
DMC, PBE-1 and BLYP-1; non-bonded neighbors move radi- 
ally outward as we go from left to right. Right panel: errors of 
the total energy relative to DMC with PBE-1, BLYP-1, PBE- 
2 and BLYP-2 DFT approximations. Energy units: mEh- 

Several key points emerge from our analysis. First, the 
pervasive importance of the energy balance between ex- 
tended and compact structures is highlighted by the fact 
that we see the same pattern of DFT errors in the hex- 
amers, the ice structures and the nonamer configurations. 
Second, we have seen that both 2B and B2B errors can 
distort this balance. With the BLYP functional, 2B er- 
rors are the main culprit in tipping the balance towards 
extended structures, but with PBE the contribution of 
B2B errors is significant. Third, it is clear that different 
kinds of B2B errors are important. With PBE, the B2B 
errors appear to be associated with the close approach of 
monomers that arc not H-bonded to each other (ice struc- 
tures, nonamer) or that involve highly distorted H-bonds 
(hexamers). With BLYP, by contrast, B2B overbinding 
occurs even for extended structures, and may be due to 
exaggerated cooperativity of H-bonding. Since dispersion 
is expected to be mainly a 2B interaction in water, our 
results clearly indicate that sources of error in addition 
to the poor treatment of dispersion should be considered. 

Is our analysis relevant to the first-principles under- 
standing of the density, structure and diffusivity of liquid 
water? It seems likely that the systematic 2B underbind- 
ing of BLYP accounts both for its undcr-prediction of 
the equilibrium density and for its prediction of an over- 
structured and under-diffusive liquid; however, its B2B 
overbinding might also be expected to contribute to over- 
structuring and under-diffusivity. On the other hand, 
the under-prediction of the density by PBE may arise 
from the excessive 2B and B2B repulsion involving H2O 
monomers that are not H-bonded to each other, an effect 
that could also account for over-structuring and under- 
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diffusivity. These suggestions may be directly testable, 
since diffusion Monte Carlo should be able to provide ac- 
curate total-energy benchmarks for the liquid in periodic 
boundary conditions; if so, the many-body analysis given 
here would be feasible for thermal samples of the liquid. 

Our analysis of first-principles errors for a variety of 
water systems into 1-, 2- and beyond-2-body components 
gives helpful insights into their fundamental energetics, 
but a detailed energy decomposition analysis (see e.g. 
Ref. [22[) is also needed, including an assessment of the 
role of non-local electron correlation in the energetics of 
the dimer and the other water systems studied here. 
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